Likelihood normalization using an ergodic HMM for continuous speech recognition

Author

  • Kazuhiko Ozeki
Abstract

In recent speech recognition technology, the score of a hypothesis is often defined on the basis of HMM likelihood. As is well known, however, direct use of the likelihood as a scoring function causes difficult problems, especially when the length of a speech segment varies depending on the hypothesis, as in word-spotting, and some kind of normalization is indispensable. In this paper, a new method of likelihood normalization using an ergodic HMM is presented, and its performance is compared with those of conventional ones. The comparison is made from three points of view: recognition rate, word-end detection power, and the mean hypothesis length. It is concluded that the proposed method gives the best overall performance.
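The length dependence described in the abstract can be illustrated with a small sketch. The abstract does not give the normalization formula, so the code below assumes one plausible form: the hypothesis log-likelihood minus the log-likelihood of the same segment under an ergodic HMM. The function names (`log_forward`, `normalized_score`, `make_hmm`) and all model parameters are hypothetical toy values, not taken from the paper.

```python
import numpy as np

def make_hmm(pi, A, B):
    """Pack initial, transition, and emission probabilities as log arrays."""
    return tuple(np.log(np.asarray(x, dtype=float)) for x in (pi, A, B))

def log_forward(obs, log_pi, log_A, log_B):
    """Return log P(obs | HMM) using the forward algorithm in the log domain."""
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        # Sum over previous states (in log space) for each next state.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return np.logaddexp.reduce(alpha)

def normalized_score(obs, word_hmm, ergodic_hmm):
    """Assumed normalization: log-likelihood ratio against an ergodic HMM."""
    return log_forward(obs, *word_hmm) - log_forward(obs, *ergodic_hmm)

# Toy discrete models over 3 symbols (illustrative values only).
word_hmm = make_hmm([0.9, 0.1],
                    [[0.7, 0.3], [0.1, 0.9]],
                    [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]])
ergodic_hmm = make_hmm([0.5, 0.5],
                       [[0.5, 0.5], [0.5, 0.5]],
                       [[0.4, 0.3, 0.3], [0.3, 0.3, 0.4]])
obs = [0, 0, 2, 2, 2]

# The raw log-likelihood shrinks as the segment grows; the ratio does not
# carry that systematic length penalty.
for t in range(1, len(obs) + 1):
    raw = log_forward(obs[:t], *word_hmm)
    norm = normalized_score(obs[:t], word_hmm, ergodic_hmm)
    print(t, round(float(raw), 3), round(float(norm), 3))
```

Because every added frame multiplies the likelihood by a factor no greater than one, raw scores of hypotheses with different lengths are not directly comparable, which is the problem the abstract raises. Subtracting a background score computed over the same frames is one way to offset that effect under the assumption stated above.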


Similar resources

Voice quality normalization in an utterance for robust ASR

In this paper, we propose a novel method of normalizing the voice quality in an utterance for both clean speech and speech contaminated by noise. The normalization method is applied to the N-best hypotheses from an HMM-based classifier, then an SM (Sub-space Method)-based verifier tests the hypotheses after normalizing the monophone scores together with the HMM-based likelihood score. The HMM-SM...


Performance Evaluation of Statistical Approaches for Text Independent Speaker Recognition Using Source Feature

This paper introduces the performance evaluation of statistical approaches for Text-Independent speaker recognition system using source feature. Linear prediction (LP) residual is used as a representation of excitation information in speech. The speaker-specific information in the excitation of voiced speech is captured using statistical approaches such as Gaussian Mixture Models (GMMs) and Hid...


Irrelevant variability normalization based HMM training using VTS approximation of an explicit model of environmental distortions

In a traditional HMM compensation approach to robust speech recognition that uses Vector Taylor Series (VTS) approximation of an explicit model of environmental distortions, the set of generic HMMs are typically trained from “clean” speech only. In this paper, we present a maximum likelihood approach to training generic HMMs from both “clean” and “corrupted” speech based on the concept of irrel...


Glottal Excitation Feature based Gender Identification System using Ergodic HMM

In this paper, through different experimental studies it is demonstrated that the time varying glottal excitation component of speech can be exploited for text independent gender recognition studies. Linear prediction (LP) residual is used as a representation of excitation information in speech. The gender-specific information in the excitation of voiced speech is captured using the Hidden Mark...


Linear Transforms in Automatic Speech Recognition: Estimation Procedures and Integration of Diverse Acoustic Data

Linear transforms have been used extensively for both training and adaptation of Hidden Markov Model (HMM) based automatic speech recognition (ASR) systems. Two important applications of linear transforms in acoustic modeling are the decorrelation of the feature vector and the constrained adaptation of the acoustic models to the speaker, the channel, and the task. Our focus in the first part of...




Publication date: 1996